OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Segmentation and classification of mixed text/graphics/image documents

Internal identifier: 002C57 (Main/Exploration); previous: 002C56; next: 002C58

Authors: Kuo-Chin Fan [People's Republic of China]; Chi-Hwa Liu [People's Republic of China]; Yuan-Kai Wang [People's Republic of China]

Source: Pattern Recognition Letters (ISSN 0167-8655), vol. 15, no. 12, pp. 1201-1209

RBID : ISTEX:123A7198BEC7D9CD578696A38B6DD150816C6241

Abstract

In this paper, a feature-based document analysis system is presented which utilizes domain knowledge to segment and classify mixed text/graphics/image documents. In our approach, we first perform a run-length smearing operation followed by the stripe merging procedure to segment the blocks embedded in a document. The classification task is then performed based on the domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge is proved to be effective in accelerating the segmentation speed and decreasing the classification error. The experimental study reveals the feasibility of the new technique in segmenting and classifying mixed text/graphics/image documents.
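The run-length smearing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives neither the threshold nor the smearing direction, so the code below shows the classic horizontal form only, under stated assumptions: pixels are 0 (white) or 1 (black), `threshold` is a hypothetical parameter, and white runs touching the row border are left untouched. The function names are illustrative, and the stripe merging procedure is not covered.

```python
def smear_row(row, threshold):
    """Horizontal run-length smearing of one binary row (1 = black, 0 = white).

    White runs of length <= threshold that lie between two black pixels
    are filled with black. Longer runs, and runs touching the row border,
    stay white (an assumption; variants differ on border handling).
    """
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1  # number of white pixels in between
                if 0 < gap <= threshold:
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def smear_image(image, threshold):
    """Apply horizontal smearing independently to every row of a binary image."""
    return [smear_row(row, threshold) for row in image]


if __name__ == "__main__":
    row = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
    print(smear_row(row, threshold=2))
```

With a threshold of 2, the two-pixel gap is filled while the four-pixel gap survives, which is how smearing merges characters into word or line blocks without joining distant regions.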

Url: https://api.istex.fr/document/123A7198BEC7D9CD578696A38B6DD150816C6241/fulltext/pdf
DOI: 10.1016/0167-8655(94)90110-4



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Segmentation and classification of mixed text/graphics/image documents</title>
<author>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</author>
<author>
<name sortKey="Liu, Chi Hwa" sort="Liu, Chi Hwa" uniqKey="Liu C" first="Chi-Hwa" last="Liu">Chi-Hwa Liu</name>
</author>
<author>
<name sortKey="Wang, Yuan Kai" sort="Wang, Yuan Kai" uniqKey="Wang Y" first="Yuan-Kai" last="Wang">Yuan-Kai Wang</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:123A7198BEC7D9CD578696A38B6DD150816C6241</idno>
<date when="1994" year="1994">1994</date>
<idno type="doi">10.1016/0167-8655(94)90110-4</idno>
<idno type="url">https://api.istex.fr/document/123A7198BEC7D9CD578696A38B6DD150816C6241/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000918</idno>
<idno type="wicri:Area/Istex/Curation">000908</idno>
<idno type="wicri:Area/Istex/Checkpoint">001F52</idno>
<idno type="wicri:doubleKey">0167-8655:1994:Fan K:segmentation:and:classification</idno>
<idno type="wicri:Area/Main/Merge">002E24</idno>
<idno type="wicri:Area/Main/Curation">002C57</idno>
<idno type="wicri:Area/Main/Exploration">002C57</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">Segmentation and classification of mixed text/graphics/image documents</title>
<author>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
<affiliation wicri:level="1">
<country xml:lang="fr" wicri:curation="lc">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Liu, Chi Hwa" sort="Liu, Chi Hwa" uniqKey="Liu C" first="Chi-Hwa" last="Liu">Chi-Hwa Liu</name>
<affiliation wicri:level="1">
<country xml:lang="fr" wicri:curation="lc">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Wang, Yuan Kai" sort="Wang, Yuan Kai" uniqKey="Wang Y" first="Yuan-Kai" last="Wang">Yuan-Kai Wang</name>
<affiliation wicri:level="1">
<country xml:lang="fr" wicri:curation="lc">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1994">1994</date>
<biblScope unit="volume">15</biblScope>
<biblScope unit="issue">12</biblScope>
<biblScope unit="page" from="1201">1201</biblScope>
<biblScope unit="page" to="1209">1209</biblScope>
</imprint>
<idno type="ISSN">0167-8655</idno>
</series>
<idno type="istex">123A7198BEC7D9CD578696A38B6DD150816C6241</idno>
<idno type="DOI">10.1016/0167-8655(94)90110-4</idno>
<idno type="PII">0167-8655(94)90110-4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper, a feature-based document analysis system is presented which utilizes domain knowledge to segment and classify mixed text/graphics/image documents. In our approach, we first perform a run-length smearing operation followed by the stripe merging procedure to segment the blocks embedded in a document. The classification task is then performed based on the domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge is proved to be effective in accelerating the segmentation speed and decreasing the classification error. The experimental study reveals the feasibility of the new technique in segmenting and classifying mixed text/graphics/image documents.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
</country>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</noRegion>
<name sortKey="Liu, Chi Hwa" sort="Liu, Chi Hwa" uniqKey="Liu C" first="Chi-Hwa" last="Liu">Chi-Hwa Liu</name>
<name sortKey="Wang, Yuan Kai" sort="Wang, Yuan Kai" uniqKey="Wang Y" first="Yuan-Kai" last="Wang">Yuan-Kai Wang</name>
</country>
</tree>
</affiliations>
</record>
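
As a rough illustration of how such a record might be read programmatically, here is a minimal Python sketch using the standard library. The record as shown uses the `wicri:` prefix without declaring it, which a namespace-aware parser will typically reject, so the sketch binds the prefix to a placeholder URI (`urn:example:wicri`, an assumption and not an official namespace). The function name `parse_record` and the shortened `SAMPLE` string are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above, kept only for illustration.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Segmentation and classification of mixed text/graphics/image documents</title>
<author>
<name sortKey="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="doi">10.1016/0167-8655(94)90110-4</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(xml_text):
    """Extract title, author names and DOI from a record like the one above."""
    # Bind the undeclared wicri: prefix on the root element (placeholder URI).
    patched = xml_text.replace(
        "<record>", '<record xmlns:wicri="urn:example:wicri">', 1
    )
    root = ET.fromstring(patched)
    stmt = root.find(".//titleStmt")
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.findall("author/name")],
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }


if __name__ == "__main__":
    print(parse_record(SAMPLE))
```

Restricting the author lookup to `titleStmt` avoids counting the same names again in `biblStruct` and in the `affiliations` tree.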

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002C57 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002C57 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:123A7198BEC7D9CD578696A38B6DD150816C6241
   |texte=   Segmentation and classification of mixed text/graphics/image documents
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024